Important notice
http://www.etsi.org

The present document may be made available in more than one electronic version or in print. In any case of existing or
perceived difference in contents between such versions, the reference version is the Portable Document Format (PDF).
In case of dispute, the reference shall be the printing on ETSI printers of the PDF version kept on a specific network drive
within ETSI Secretariat.

Users of the present document should be aware that the document may be subject to revision or change of status. 

Information on the current status of this and other ETSI documents is available at
http://portal.etsi.org/tb/status/status.asp

If you find errors in the present document, please send your comment to one of the following services:
http://portal.etsi.org/chaircor/ETSI_support.asp

Copyright Notification 

No part may be reproduced except as authorized by written permission. 
The copyright and the foregoing restriction extend to reproduction in all media. 

© European Telecommunications Standards Institute 2004. 
All rights reserved. 

DECT™, PLUGTESTS™ and UMTS™ are Trade Marks of ETSI registered for the benefit of its Members. 
TIPHON™ and the TIPHON logo are Trade Marks currently being registered by ETSI for the benefit of its Members. 
3GPP™ is a Trade Mark of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners. 
Intellectual Property Rights



IPRs essential or potentially essential to the present document may have been declared to ETSI. The information 
pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found 
in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in 
respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web 
server ( http://webapp.etsi.org/IPR/home.asp ). 

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee 
can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web 
server) which are, or may be, or may become, essential to the present document. 



Foreword 

This Technical Specification (TS) has been produced by ETSI 3rd Generation Partnership Project (3GPP). 

The present document may refer to technical specifications or reports using their 3GPP identities, UMTS identities or 
GSM identities. These should be interpreted as being references to the corresponding ETSI deliverables. 

The cross reference between GSM, UMTS, 3GPP and ETSI identities can be found under
http://webapp.etsi.org/key/queryform.asp.
Foreword

This Technical Specification has been produced by the 3rd Generation Partnership Project (3GPP).

The contents of the present document are subject to continuing work within the TSG and may change following formal 
TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an 
identifying change of release date and an increase in version number as follows: 

Version x.y.z 

where: 

x the first digit: 

1 presented to TSG for information; 

2 presented to TSG for approval; 

3 or greater indicates TSG approved document under change control. 

y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, 
updates, etc. 

z the third digit is incremented when editorial only changes have been incorporated in the document. 
1 Scope

The present document contains an electronic copy of the ANSI-C code for the DSR Extended Advanced Front-end. The
ANSI-C code is necessary for a bit-exact implementation of the DSR Extended Advanced Front-end.



2 References

The following documents contain provisions which, through reference in this text, constitute provisions of the present
document.

[1] ETSI ES 202 050: "Distributed Speech Recognition; Advanced Front-end Feature Extraction
    Algorithm; Compression Algorithm", Oct 2002.
[2] ETSI ES 202 212: "Distributed Speech Recognition; Extended Advanced Front-end Feature
    Extraction Algorithm; Compression Algorithm, Back-end Speech Reconstruction Algorithm",
    Nov 2003.
[3] 3GPP TS 26.177: "Speech Enabled Services (SES); Distributed Speech Recognition (DSR)
    extended advanced front-end test sequences".



3 Definitions and abbreviations 

3.1 Definitions 

Definitions of terms used in the present document can be found in [1] and [2].

3.2 Abbreviations 

For the purpose of the present document, the following abbreviations apply: 

ANSI American National Standards Institute 

I/O Input/Output 

RAM Random Access Memory 

ROM Read Only Memory 

AFE Advanced Front-end 

X-AFE eXtended Advanced Front-end 

DSR Distributed Speech Recognition 



4 C code structure

This clause gives an overview of the structure of the bit-exact C code and of the contents and
organization of the C code attached to this document.

The C code has been verified on the following systems:

- Sun Microsystems workstations and GNU gcc compiler;
- IBM PC compatible computers with Linux operating system and GNU gcc compiler.

ANSI-C was selected as the programming language because portability was desirable.

4.1 Contents of the C source code 

The distributed files with suffix "c" contain the source code and the files with suffix "h" are the header files. 
Makefiles are provided for the platforms on which the C code has been verified (listed above).



4.2 Program execution 



There are separate executables for the FrontEnd and Vector Quantization, with and without Extensions. The command 
line options are described below. 

<> - indicates the allowed parameter values for the given option when running the executable;
() - indicates the default parameter.

FrontEnd w/ Extension:

USAGE: bin/ExtAdvFrontEnd infile HTK_outfile pitch_outfile class_outfile [options]

OPTIONS:

-q                      Quiet mode (FALSE)
-F format               Input file format <NIST,HTK,RAW> (NIST)
-fs freq                Sampling frequency in kHz <8,16> (8)
-swap                   Change input byte ordering (Native)
-noh                    No HTK header to output file (FALSE)
-noc0                   No c0 coefficient to output feature vector (FALSE)
-nologE                 No logE component to output feature vector (FALSE)
-skip_header_bytes n    Skip header, first n bytes (only for -F RAW)

-noh, -noc0, -nologE and -skip_header_bytes are not used and should not be changed.

FrontEnd w/o Extension:

USAGE: bin/AdvFrontEnd infile HTK_outfile [options]

OPTIONS: same as FrontEnd w/ Extension.

Vector Quantization w/ Extension:

USAGE: extcoder htk_file_in pitch_file_in class_file_in bitstream_file_out pitch_file_out txt_file_out -freq x -VAD/No_VAD

htk_file_in           Input mel-frequency cepstral coefficient file in HTK MFCC format.
pitch_file_in         Input pitch period file.
class_file_in         Input classification file.
bitstream_file_out    Output binary bitstream.
pitch_file_out        Output quantised pitch period file.
txt_file_out          Vector quantiser output in text format.
-freq x               Sampling frequency in kHz (8 or 16).
-VAD                  Use voice activity detector data. Voice activity input file must have same name as htk_file, but
                      extension .vad
-No_VAD               Do not incorporate voice activity detector information in output bitstream.

Vector Quantization w/o Extension:

USAGE: coder htk_file_in bitstream_file_out txt_file_out -freq x -VAD/No_VAD

htk_file_in           Input mel-frequency cepstral coefficient file in HTK MFCC format.
bitstream_file_out    Binary output bitstream.
txt_file_out          Vector quantiser output in text format.
-freq x               Sampling frequency in kHz (8 or 16).
-VAD                  Use voice activity detector data. Voice activity input file must have same name as htk_file, but
                      extension .vad
-No_VAD               Do not incorporate voice activity detector information in output bitstream.
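
The naming rule for the -VAD option (same name as the HTK file, extension .vad) can be sketched in C as follows. This is
only an illustration: vad_file_name() is a hypothetical helper, not a function of the distributed code, and how the coder
itself builds the name is not described here.

```c
#include <stddef.h>
#include <string.h>

/*
 * Build the VAD file name for a given HTK feature file: same name, with the
 * extension replaced by ".vad" (or appended when the name has no extension).
 * Returns 0 on success, -1 if the result does not fit in 'out'.
 * vad_file_name() is a hypothetical helper, not part of the distributed code.
 */
int vad_file_name(const char *htk_file, char *out, size_t out_size)
{
    const char *slash = strrchr(htk_file, '/');   /* last path separator, if any */
    const char *dot = strrchr(htk_file, '.');     /* last dot in the whole path */
    size_t stem_len;

    /* A dot inside a directory name, or one that starts the file name,
       is not treated as an extension separator. */
    if (dot == NULL || (slash != NULL && dot < slash) ||
        dot == htk_file || (slash != NULL && dot == slash + 1))
        stem_len = strlen(htk_file);
    else
        stem_len = (size_t)(dot - htk_file);

    if (stem_len + sizeof(".vad") > out_size)     /* sizeof counts the trailing NUL */
        return -1;

    memcpy(out, htk_file, stem_len);
    memcpy(out + stem_len, ".vad", sizeof(".vad"));
    return 0;
}
```

For an input such as data/utt1.cep the helper yields data/utt1.vad; a caller would check the return value before opening
the file.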

File extension descriptions as generated by the sample script:

.cep    - Binary file containing cepstral features in HTK format. Output from the FrontEnd, input to the vector quantizer.
.pitch  - Binary file containing pitch information. Output from the FrontEnd, input to the vector quantizer. Only used for
          Extension.
.class  - ASCII file containing class information. Output from the FrontEnd, input to the vector quantizer. Only used for
          Extension.
.bs     - Binary file containing the bitstream. Output from the vector quantizer.
.log    - Log files from the different executables.
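
How the files flow from the FrontEnd to the vector quantizer can be illustrated by formatting the two command lines for
one utterance. The sketch below is an assumption-laden example, not the sample script itself: build_ext_commands() is a
hypothetical helper, the .raw, .qpitch and .txt suffixes and the chosen options (-F RAW, -No_VAD) are picked for
illustration only, and snprintf() requires a C99 library.

```c
#include <stdio.h>

/*
 * Format the two command lines for one utterance (with Extension):
 * the front-end first, then the vector quantizer. 'base' is the file name
 * without extension. The ".raw", ".qpitch" and ".txt" suffixes and the
 * selected options are assumptions made for this sketch.
 * Returns 0 on success, -1 if a buffer is too small.
 */
int build_ext_commands(const char *base, int fs_khz,
                       char *fe_cmd, size_t fe_size,
                       char *vq_cmd, size_t vq_size)
{
    int n;

    /* FrontEnd: infile HTK_outfile pitch_outfile class_outfile [options] */
    n = snprintf(fe_cmd, fe_size,
                 "bin/ExtAdvFrontEnd %s.raw %s.cep %s.pitch %s.class -F RAW -fs %d",
                 base, base, base, base, fs_khz);
    if (n < 0 || (size_t)n >= fe_size)
        return -1;

    /* Vector quantizer: consumes .cep/.pitch/.class, produces the .bs bitstream */
    n = snprintf(vq_cmd, vq_size,
                 "extcoder %s.cep %s.pitch %s.class %s.bs %s.qpitch %s.txt -freq %d -No_VAD",
                 base, base, base, base, base, base, fs_khz);
    if (n < 0 || (size_t)n >= vq_size)
        return -1;

    return 0;
}
```

The resulting strings could be passed to system() or written to a script; error handling and logging to .log files are
left out of the sketch.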



3GPP TS 26.243 version 6.1.0 Release 6    ETSI TS 126 243 V6.1.0 (2004-12)

4.3 Code hierarchy 

Tables 1 to 3 are call graphs that show the functions used for AFE (table 1), VQ (table 2), and Extension (table 3). 

Each column represents a call level and each cell a function. The functions contain calls to the functions in rightwards 
neighboring cells. The time order in the call graphs is from the top downwards as the processing of a frame advances. 
All standard C functions: printf(), fwrite(), etc. have been omitted. Also, no basic operations (add(), L_add(), mac(), 
etc.) or double precision extended operations (e.g. L_Extract()) appear in the graphs. 

The basic operations are not counted as extending the depth, therefore the deepest level in this software is level 7. 
Table 1: AFE call structure

main()
AdvProcessInit_B()
DoNoiseSupInit_B()
DoWaveProcInit_B()
DoCompCepsInit_B()
DoPostProcInit_B()
DoVADInit_F()
Do16kProcInit_B()
QMF_FIR_Init_B()
AdvProcessAlloc_B()
FlushAdvProcess_B()
AdvProcessDelete_B()
DoAdvProcess_B()
firinitialization_B()
DP_HP_filters_B()
BufIn32Alloc()
DoNoiseSupAlloc_B()
DoWaveProcAlloc_B()
DoCompCepsAlloc_B()
DoPostProcAlloc_B()
DoVADAlloc_F()
Do16kProcAlloc_B()
DoVADFlush_F()
CvFeatInt2Float()
DoNoiseSupDelete_B()
DoWaveProcDelete_B()
DoCompCepsDelete_B()
DoPostProcDelete_B()
DoVADDelete_B()
BufIn32Free()
Do16kProcessing_B()
DoNoiseSup_B()
Get16k_p_bufferData16k_B()
Get16k_bufData16kSize_B()
Get16k_p_BandsForCoding16k_B()
Get16k_p_CodeForBands16k_B()
Get16k_dataHP_B()
VAD_F()
Log_2()
DoSigWindowing16_F1()
DoSigWindowing16_F2()
ff4NRFix32_B()
GetL15()
GetH15()
Mult16x32()
Add_Mult16x16_16()
Sub_Mult16x16_16()
Permut()
FFTtoPSD_F()
Square24d2_B()
Square24_B()
Get16k_BFC_dec_B()
GetBandsForCoding16k_B()
PSDMean_F()
NoiseEstimation_F1()
Sqrt_2()
Sqrt16_2()
NoiseEstimation_F2()
Sqrt_2()
Sqrt16_2()
FilterCalc_F()
SpeechQVar()
FilterBank16()
SpeechQSpec()
SpeechQMel()
DoGainFact_F1()
Log_2()
DoGainFact_F2()
Log_2()
DoMelIDCT_F16()
ApplyWF()
Get16k_dec1()
Get16k_dec2()
Get16k_dec3()
DoSigWindowing16_F3()
ff4NRFix32_B()
GetL15()
GetH15()
Mult16x32()
Add_Mult16x16_16()
Sub_Mult16x16_16()
Permut()
FFTtoPSD_F()
Square24d2_B()
Square24_B()
DoMelFB_B()
CodeBands16k_B()
DoSpecSub16k_B()
Log_2()
UpDateDecal()
ApplyDecal()
DCOffsetFil_F()
Get16k_hpBandsSize_B()
DoWaveProc_B()
DoCompCeps_B()
Get16k_p_hpBands_B()
Get16k_p_bufferCodeForBands16k_B()
Get16k_p_CodeForBands16k_B()
Get16k_p_bufferCodeWeights_B()
Get16k_p_codeWeights_B()
Set16k_hpBands_dec_B()
TeagerEng()
GetTeagerFilter()
CepsCompute()
DoPostProc_B()
DoVADProc_F()
focalpoint()
GetMaximaPositions()
Get16k_p_bufferCodeWeights_B()
Get16k_p_bufferCodeForBands16k_B()
PreEmphHamm()
ff4NB16_B()
GetBandsForDecoding16k_B()
DecodeBands16k_B()
FilterBank()
Get16k_hpBands_dec_B()
Get16k_p_hpBands_B()
MergeSSandCoded_B()
CorrectEnergy_B()
CosInv16Khz()
cosInv() (only for 8 kHz)



Table 2: VQ call structure 



main()
quantize_and_print()
get_best_dataframe()
quant_pitch_abs()
get_class_bit()
quant_pitch_diff()
get_class_bit()
mfcc_crc_encode()
pc_crc_encode()
best_centroid()
Table 3: Extension call structure

main()
RVC_ConstructPitchRom_be()
RVC_ConstructPitchMeter_be()
RVC_DestructPitchRom_be()
RVC_DestructPitchMeter_be()
DoAdvProcess_B()
Allocate_InterpolatedDft_be()
RVC_ResetPitchMeter_be()
Deallocate_InterpolatedDft_be()
DoPitchExtract()
FilterBank()
dsr_afe_vad()
get_vm()
IsLowBandNoise()
get_zcm()
pre_process()
RVC_MeasurePitch_be()
fnLog2()
iir_d()
iir_s()
ClearPitch_be()
DirichletInterpolation_be()
IsLowLevelInput_be()
Finalize_be()
IsContinuousPitch_be()
Mpy_lw_sw()
PrepareSpectralPeaks_be()
Mpy_lw_sw()
CalcSpectrum_be()
FindPeaks_be()
PrelimScaleDownAmpsOfHighFreqPeaks_be()
qsort_be()*
CompareIpointAmp_be()
RefineSpectralPeaks_be()
FindPitchCandidates_be()
FinalScaleDownAmpsOfHighFreqPeaks_be()
Mpy_lw_sw()
Mpy_lw_sw()
Mpy_lw_sw_Add()
swap()
sqrt_l_fix()
NormalizeAmplitudes_be()
CalcUtilityFunction_be()
CreatePieceWiseConstantFunction_be()
qsort_be()*
Compare_ARRAY_OF_XPOINTS_be()
LinkArrayOfPoints_be()
AddSortedArrayOfPoints_be()
FindDominantLocalMaximaInUtilityFunction_be()
UtilityFunctionAtGivenPitchFreq_be()
qsort_be()*
ComparePitchFreqAscending_be()
SelectTopPitchCandidates_be()
compute_pcorr_be()
ConvertLinkedListOfDiffPointsToUtilFunc_be()
L_Extract()
Mpy_32_16()
swap()
LinkArrayOfPoints_be()
Mpy_lw_sw()
swap()
Mpy_lw_sw()
interpolate_be()
sqrt_l_fix()
find_most_energetic_window_be()
accumulate_be()
SelectFinalPitch_be()
qsort_be()*
find_most_energetic_window2_be()
Mpy_lw_sw()
Mpy_lw_sw()
Mpy_lw_lw()
ComparePitchFreqDescending_be()
ClearPitch_be()
GOOD_ENOUGH_be()
CLOSELY_LOCATED_be()
BETTER_be()
IsContinuousPitch_be()
classify_frame()
CalculateDoubleWindowDft_be()
swap()
Mpy_lw_sw()
Mpy_lw_sw()

* qsort_be() is a recursive function.






4.5 Variables, constants and tables 

The data types of variables and tables used in the fixed point implementation are signed integers in 2's complement 
representation, defined by: 

- Word16    16 bit variable;
- Word32    32 bit variable.
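
A minimal sketch of these two types and of saturating additions in the style of the add() and L_add() basic operations is
given below. It rests on several assumptions: the typedefs use the C99 <stdint.h> types (the distributed ANSI-C code
defines Word16 and Word32 itself), the function names sat_add16() and sat_add32() are hypothetical, and saturation on
overflow is assumed to follow the usual ETSI/ITU-T basic operator convention.

```c
#include <stdint.h>

/* Assumed typedefs; the distributed code provides its own definitions. */
typedef int16_t Word16;   /* 16 bit signed, 2's complement */
typedef int32_t Word32;   /* 32 bit signed, 2's complement */

/* Saturating 16 bit addition (sketch of an add()-style basic operation). */
Word16 sat_add16(Word16 a, Word16 b)
{
    Word32 sum = (Word32)a + (Word32)b;   /* cannot overflow in 32 bits */
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return (Word16)sum;
}

/* Saturating 32 bit addition (sketch of an L_add()-style basic operation). */
Word32 sat_add32(Word32 a, Word32 b)
{
    int64_t sum = (int64_t)a + (int64_t)b;   /* widen to avoid signed overflow */
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (Word32)sum;
}
```

Clamping to the type limits instead of wrapping around keeps the result well defined in C and independent of the
platform's native overflow behaviour.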
4.5.1 Description of constants used in the C-code

Table 5a: Global constants for AFE

Constant                          Value        Description
NS_SPEC_ORDER_16K                 64           Noise suppression array length
NS_HANGOVER_16K                   15           Noise suppression hangover count
NS_MIN_SPEECH_FRAME_HANGOVER_16K  4            Noise suppression minimum speech frame hangover count
NS_ANALYSIS_WINDOW_16K            80           Noise suppression analysis window
PERC_CODED                        0.7          lambda merge (empirically set constant)
LAMBDA_NSE16k                     0.99         Noise estimation lambda
NS_NB_FRAME_THRESHOLD_NSE         100          Noise suppression number of frame threshold used for NSE
LENGTH_QMF                        118          QMF filter length
f24                               1            Multiplier for QMF filter coefficients
SHFF_H                            8            Shift to get higher value
L_H                               16           Shift to get lower value
HP16k_MEL_USED                    3            Higher frequency band Mel used
NB_LP_BANDS_CODING                3            Lower frequency band used in coding
NE16k_FRAMES_THRESH               100          Noise estimation frames threshold
NB_TOPOSTPROC                     12           Number of coefficients to postprocess
CEP_FRAME_LENGTH                  200          Frame length for cepstral coefficients
CEP_NB_COEF                       13           Number of cepstral coefficients (including c0)
CEP_NB_CHANNELS                   23           Number of filters used for cepstral coefficients
CEP_FFT_LENGTH                    256          FFT length for cepstral coefficients
FRAME_BUF_SIZE                    241          Denoised output buffer size
FRAME_SHIFT                       80           WaveProcessing input frame shift
FRAME_LENGTH                      200          WaveProcessing frame size
NS_SPEC_ORDER                     65           Noise suppression array length (8 kHz)
NS_BUFFER_SIZE                    180          Noise suppression past frame size
NS_FRAME_SHIFT                    80           Noise suppression input frame shift
NS_HALF_FILTER_LENGTH             8            Noise suppression filter half size
NS_NB_FRAME_THRESHOLD_LTE         10           Noise suppression long term energy forgetting factor threshold (in frames)
NS_NB_FRAME_THRESHOLD_NSE         100          Noise suppression spectrum estimate forgetting factor threshold (in frames)
NS_MIN_FRAME                      10           Number of frame threshold to update average energy for noise suppression VAD
NS_FFT_LENGTH                     256          FFT length for noise suppression
WF_MEL_ORDER                      25           Noise suppression Wiener filter order
SHFT_NOISE                        14           Shift applied to noise spectrum estimate
SHFT_FACT_MUL                     14           Shift applied to gain coefficient (noise suppression gain factorization)
IDCT_ORDER                        25           Noise suppression IDCT order
NS_BETA                           0.98         Noiseless signal suppression factor
NS_RSB_MIN                        0.079432823  Minimum a priori SNR
NS_LAMBDA_NSE                     0.99         Forgetting factor for noise spectrum estimate
NS_LOG_SPEC_FLOOR                 -10.0        Average energy minimum threshold
NS_SNR_THRESHOLD_VAD              15           SNR threshold for noise suppression VAD
NS_SNR_THRESHOLD_UPD_LTE          20           Long term energy update threshold for noise suppression VAD
NS_ENERGY_FLOOR                   80           Energy minimum threshold for noise suppression VAD
MaxPos                            10           Maximum number of maxima in waveprocessing
WP_EPS                            0.2          Weighting value added or subtracted for waveprocessing



Table 5b: Global constants for VQ

Constant              Value      Description
MIN_PERIOD            1245184    Minimum pitch period allowed
MAX_PERIOD            9175040    Maximum pitch period allowed
NUM_MULTI_LEVELS_1    26         Number of levels in pitch quantization
NUM_MULTI_LEVELS_2    24         Number of levels in pitch quantization
UNVOICED_CODE                    Init value for Qpindex
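
The two period limits are consistent with a fixed-point layout that has 16 fractional bits: 1245184 = 19 x 65536 and
9175040 = 140 x 65536, i.e. 19 and 140 samples. This reading is inferred from the values only and is not stated in the
table, so the sketch below should be taken as an assumption. The helper names and the clamping step are hypothetical and
do not describe how the distributed quantizer uses the limits.

```c
#include <stdint.h>

typedef int32_t Word32;               /* assumed 32 bit signed type */

#define MIN_PERIOD 1245184            /* value from Table 5b */
#define MAX_PERIOD 9175040            /* value from Table 5b */
#define Q16_SHIFT  16                 /* assumption: 16 fractional bits */

/* Integer part (in samples) of a Q16 period value; intended for non-negative input. */
Word32 period_q16_to_samples(Word32 period_q16)
{
    return period_q16 >> Q16_SHIFT;
}

/* Clamp a Q16 pitch period to the allowed range [MIN_PERIOD, MAX_PERIOD]. */
Word32 clamp_period_q16(Word32 period_q16)
{
    if (period_q16 < MIN_PERIOD) return MIN_PERIOD;
    if (period_q16 > MAX_PERIOD) return MAX_PERIOD;
    return period_q16;
}
```

Under this assumption, and for the 8 kHz reference sampling rate, 19 and 140 samples correspond to pitch frequencies of
roughly 421 Hz and 57 Hz.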



Table 5c: Global constants for Extension

Constant                            Value        Description
HISTORY_LEN                         100          History length - past samples for pitch extraction
DOWN_SAMP_FACTOR                    4            Down-sampling factor - used in computing correlation
NO_OF_DFT_POINTS                    128          Number of DFT points
BREAK_POINT                         12           Break point - marks the end of low frequency band
LBN_HIST_WEIGHT                     32440        Low band noise history weight
LBN_CURR_WEIGHT                     328          Low band noise current weight (32768 - LBN_HIST_WEIGHT)
LBN_MAX_THR                         124518       Low band noise maximum threshold
LBN_LOW_ENR_LEVEL_MANT              32000        Low band noise low energy level mantissa
LBN_LOW_ENR_LEVEL_SHFT              22           Low band noise low energy level shift
RVC_OK                                           Return code for success
RVC_ERR                             -1           Return code for unspecified error
RVC_ERR_NOT_ENOUGH_MEMORY           -2           Return code for not enough memory
RVC_ERR_ILLEGAL_ARGUMENT            -3           Return code for an illegal input/output argument
RVC_ERR_IO_FAILED                   -4           Return code for failed input/output to a file
RVC_ERR_BAD_FILE_FORMAT             -5           Return code for a bad file header
RVC_ERR_NOT_INITIALIZED             -6           Return code for failure due to improper initialization
RVC_ERR_ILLEGAL_USAGE               -7           Return code for illegal usage of a function
RVC_ERR_NOT_ENOUGH_SAMPLES          -8           Return code for insufficient number of samples
RVC_ERR_NOT_IMPLEMENTED             -9           Return code for an unimplemented function
RVC_ERR_FAIL_OPEN_FILE              -10          Return code for failure to open a file
UB_ENRG_FRAC                        59           Upper band energy fraction
ZCM_THLD                            87           Zero crossing measure threshold
SQRT_ONE_HALF                       0x5A82       Square root of 0.5 (0.707)
FRAME_LEN_DS                        50           Frame length downsampled (200/4)
FRAME_LEN_DS_BY_2                   25           Frame length downsampled divided by 2
HISTORY_LEN_DS                      25           History length downsampled (100/4)
WINDOW_LENGTH                       18           Window length used in computing correlation
INV_WINDOW_LENGTH                   1820         Inverse of window length (1/18 = 0.05556)
NUM_CHAN                            23           Number of channels or Mel-frequency bands
MIN_CH_ENRG_MANTISSA                20000        Minimum channel energy mantissa
MIN_CH_ENRG_SHIFT                   25           Minimum channel energy shift
INIT_SIG_ENRG_MANTISSA              30518        Initial signal energy mantissa
INIT_SIG_ENRG_SHIFT                 8            Initial signal energy shift
CE_SM_FAC                           18022        Channel energy smoothing factor
CE_SM_FAC_COMPL                     14746        Channel energy smoothing factor complement
CNE_SM_FAC                          3277         Channel noise energy smoothing factor
CNE_SM_FAC_COMPL                    29491        Channel noise energy smoothing factor complement
LO_GAMMA                            22938        Low gamma value
LO_GAMMA_COMPL                      9830         Low gamma value complement
HI_GAMMA                            29491        High gamma value
HI_GAMMA_COMPL                      3277         High gamma value complement
LO_BETA                             31130        Low beta value
HI_BETA                             32702        High beta value
INIT_FRAMES                         10           Initial number of frames (considered to be noise frames)
SINE_START_CHAN                     4            Sine start channel (for sine wave detection)
PEAK_TO_AVE_THLD                    10           Peak to average threshold
DEV_THLD                            1523942      Deviation threshold
HYSTER_CNT_THLD                     9            Hysteresis count threshold
F_UPDATE_CNT_THLD                   500          Forced update count threshold
NON_SPEECH_THLD                     32           Non-speech threshold
FIX_34                              24576        (short) (32768.0 * 3.0/4.0)
FIX_18                              4096         (short) (32768.0 * 1.0/8.0)
FIX_INVSQRT2                        -23170       1 / sqrt(2)
swTHIRD_REF_BANDWIDTH               85           One third of the reference bandwidth
swTWO_THIRDS_REF_BANDWIDTH          171          Two thirds of the reference bandwidth
MIN_ENERGY_MANTISSA                 25600        Minimum energy mantissa
MIN_ENERGY_SHIFT                    18           Minimum energy shift
swREF_SAMPLE_RATE_Q0                0x1F40       Reference sampling rate in Q0 format
swCLOSE_FACTOR_Q14                  0x4CCD       Closeness factor in Q14 format
swFD_SCORE_THLD1_Q15                0x63D7       Frequency domain score threshold 1 in Q15 format
swFD_SCORE_THLD2_Q15                0x570A       Frequency domain score threshold 2 in Q15 format
swCORR_THLD_Q15                     0x651F       Correlation threshold in Q15 format
swSUM_THLD_Q14                      0x6667       Sum threshold in Q14 format
lwCRITO_OFFSET_Q15                  0x0000170A   Offset for finding a better pitch candidate in Q15 format
swCANDCORR_THLD1_Q15                0x799A       Pitch candidate correlation threshold 1 in Q15 format
swCANDCORR_THLD2_Q15                0x599A       Pitch candidate correlation threshold 2 in Q15 format
swCANDCORR_THLD3_Q15                0x6CCD       Pitch candidate correlation threshold 3 in Q15 format
swCANDAMP_THLD3_Q15                 0x68F6       Pitch candidate amplitude threshold 3 in Q15 format
swSTARTFREQ_COEFF                   0x553F       Start frequency coefficient (for candidate search)
swENDFREQ_COEFF                     0x4666       End frequency coefficient (for candidate search)
DIRICHLET_KERNEL_SPAN               8            Dirichlet kernel span (for interpolation)
REF_SAMPLE_RATE                     8000         Reference sampling rate
REF_BANDWIDTH                       4000         Reference bandwidth
lwTHIRD_REF_BANDWIDTH               87381333     One third of the reference bandwidth
lwTWO_THIRDS_REF_BANDWIDTH          174762667    Two thirds of the reference bandwidth
swCENTER_WEIGHT                     0x5000       Center weight
swSIDE_WEIGHT                       0x1800       Side weight
swAMP_SCALE_DOWN1                   0x5333       Amplitude scale down factor 1
swAMP_SCALE_DOWN2                   0x399A       Amplitude scale down factor 2
swAMP_SCALE_DOWN2b                  0x7333       Amplitude scale down factor 2b
swUDIST1                            -4160        Utility function distance 1
swUDIST2                            -6400        Utility function distance 2
swUSTEP                             -16384       Utility function step
swFREQ_MARGIN1                      0x4AE1       Frequency margin 1
swAMP_MARGIN1                       0x07AE       Amplitude margin 1
swAMP_MARGIN2                       0x07AE       Amplitude margin 2
MIN_STABLE_FRAMES                   6            Minimum number of stable frames
MAX_TRACK_GAP_FRAMES                2            Maximum pitch track gap frames
swSTABLE_FREQ_UPPER_MARGIN          0x4E14       Stable frequency upper margin
swSTABLE_FREQ_LOWER_MARGIN          0x68EB       Stable frequency lower margin
UNVOICED                                         Pitch frequency of an unvoiced frame
lwMAX_PITCH_FREQ                    0x01A40000L  Maximum pitch frequency
lwMIN_PITCH_FREQ                    0x00340000L  Minimum pitch frequency
MAX_PITCH_FREQ                      420          Maximum pitch frequency in Hz
MIN_PITCH_FREQ                      52           Minimum pitch frequency in Hz
HIGHPASS_CUTOFF_FREQ                300          Highpass cut-off frequency in Hz
NO_OF_FRACS                         77           Number of fractions in the fractions table
lwSHORT_WIN_START_FREQ              0x00C80000L  Short window start frequency
lwSHORT_WIN_END_FREQ                0x01A40000   Short window end frequency
lwSINGLE_WIN_START_FREQ             0x00640000L  Single window start frequency
lwSINGLE_WIN_END_FREQ               0x00D20000L  Single window end frequency
lwDOUBLE_WIN_START_FREQ             0x00340000   Double window start frequency
lwDOUBLE_WIN_END_FREQ               0x00780000L  Double window end frequency
MAX_LOCAL_MAXIMA_ON_SPECTRUM        70           Maximum number of local maxima on the spectrum
MAX_PEAKS_FOR_SORT                  30           Maximum number of peaks for sorting
MAX_PEAKS_PRELIM                    7            Maximum number of peaks (preliminary)
MIN_PEAKS                           7            Minimum number of peaks
MAX_PEAKS_FINAL                     20           Maximum number of peaks (final)
MAX_PRELIM_CANDS                    4            Maximum number of preliminary candidates (pitch)
CREATE_PIECEWISE_FUNC_LOOP_LIM_SH   20           Create piecewise function loop limit for short window
CREATE_PIECEWISE_FUNC_LOOP_LIM_SNG  30           Create piecewise function loop limit for single window
CREATE_PIECEWISE_FUNC_LOOP_LIM_DBL  60           Create piecewise function loop limit for double window
swSUM_FRACTION                      0x799A       Sum fraction
swAMP_FRACTION                      0x33F8       Amplitude fraction
MAX_BEST_CANDS                      2            Maximum number of best candidates (pitch)
N_OF_BEST_CANDS_SHORT               2            Number of best candidates for short window
N_OF_BEST_CANDS_SINGLE              2            Number of best candidates for single window
N_OF_BEST_CANDS_DOUBLE              2            Number of best candidates for double window
N_OF_BEST_CANDS                     6            Number of best candidates for all windows
SIZE_SCRATCH_DOPITCH                1090         Scratch memory size for DoPitch() function (This is the actual size
                                                 required. The declared size in C simulation is 1632)
SIZE_SCRATCH_ADVPROCESS             825          Scratch memory size for DoAdvProcess() function (This is the actual size
                                                 required. The declared size in C simulation is 1100)
RVC_PITCH_ROM_SIG                   11031        Signature for RVC_PITCH_ROM structure
RVC_PITCH_METER_SIG                 21053        Signature for RVC_PITCH_METER structure
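
Several of the values above can be reproduced with simple fixed-point arithmetic, which may help when checking a port of
the code. The sketch below is illustrative only. The table itself gives FIX_34 and FIX_18 as 32768.0 scaled fractions
(Q15); reading the smoothing factors, gammas and betas as Q15 values (0.55, 0.1, 0.7, 0.9, ...) and the lw* frequencies
as Hz shifted left by 16 bits is an inference from the numbers, not a statement of the specification. The helper names
to_q15(), q15_complement() and hz_to_q16() are hypothetical.

```c
#include <stdint.h>

typedef int16_t Word16;   /* assumed 16 bit signed type */
typedef int32_t Word32;   /* assumed 32 bit signed type */

/* Convert a value in [0, 1) to Q15 with rounding to nearest. */
Word16 to_q15(double v)
{
    return (Word16)(v * 32768.0 + 0.5);
}

/* Complement of a Q15 factor, i.e. (1.0 - f) as 32768 - f; valid for f in [1, 32767]. */
Word16 q15_complement(Word16 f)
{
    return (Word16)(32768 - (Word32)f);
}

/* Integer frequency in Hz placed in the upper 16 bits; valid for 0 <= hz < 32768. */
Word32 hz_to_q16(Word32 hz)
{
    return hz * (Word32)65536;
}
```

With these helpers, to_q15(0.75) gives 24576 (FIX_34), q15_complement(18022) gives 14746 (CE_SM_FAC_COMPL), and
hz_to_q16(420) gives 0x01A40000 (lwMAX_PITCH_FREQ).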
4.5.2 Description of fixed tables used in the C-code

This section contains a listing of all fixed tables sorted by source file name and table name. All table data is declared as
Word16.

Table 6a: Fixed tables for AFE

File                  Table name           Length  Description
16kHzProcessing_B.c   table_pow2           33      Table for square root
                      LambdaNSEx2          100     Table used to compute first 100 LambdaNSE
                      dp02_h               59      MSB of QMF filter coefficients
                      dp02_l               43      LSB of QMF filter coefficients
PostProc_B.c          targetLMS16          12      Target for blind equalization
ComCeps_B.c           HalfHamming16        100     Hamming window coefficients
                      CosMatrix16          144     Inverse cosine coefficients at 8 kHz (not used at 16 kHz)
                      CosMatrix16_16khz    156     Inverse cosine coefficients at 16 kHz
                      pondMelFilter        309     Mel bank coefficients
ff4nrFix16_B.c        tabSin               64      Sine table
                      tabCos               64      Cosine table
MathFunc.c            tblntO               48      Coefficients for computation of square root
ExtNoiseSup_B.c       lambda_1divX         20      Computation of 1/N
                      Hann_sh32_hi         100     MSB of Hanning window coefficients (32 bits)
                      Hann_sh32_lo         100     LSB of Hanning window coefficients (32 bits)
                      Hann_sh24_hi         100     MSB of Hanning window coefficients (24 bits)
                      Hann_sh24_lo         100     LSB of Hanning window coefficients (24 bits)
                      pondMelFilterNoise   157     Mel-frequency scale coefficients (applied to the Wiener filter)
                      idctMel16            234     Mel-warped inverse DCT coefficients
                      pondMelFilter16k     134     Filter bank coefficients at 16 kHz
                      M1_LamdaLTE          8       Computation of 1/N
                      M1_LambdaNSEx2       100     Computation of 2/N
                      M1_LamdaNSE          9       Computation of 1/N
                      mlnvLambda16         10      Computation of 2/N



Table 6b: Fixed tables for VQ

File          Table name                 Length  Description
coder_VAD.c   quantizer16kHz_1           128     vq table
              quantizer16kHz_2_3         128     vq table
              quantizer16kHz_4_5         128     vq table
              quantizer16kHz_6_7         128     vq table
              quantizer16kHz_8_9         128     vq table
              quantizer16kHz_10_11       64      vq table
              quantizer16kHz_12_13       512     vq table
              quantizer8kHz_1            128     vq table
              quantizer8kHz_2_3          128     vq table
              quantizer8kHz_4_5          128     vq table
              quantizer8kHz_6_7          128     vq table
              quantizer8kHz_8_9          128     vq table
              quantizer8kHz_10_11        64      vq table
              quantizer8kHz_12_13        512     vq table
              weight16kHz_c0_shift               vq weights
              weight16kHz_c0_norm                vq weights
              weight16kHz_logE                   vq weights
              weight8kHz_c0_shift                vq weights
              weight8kHz_c0_norm                 vq weights
              weight8kHz_logE                    vq weights
              plwQuantLevels[127]        127*2   vq tables for pitch/class quantization
              ppplwQuantSections[8][3]   24*2    vq tables for pitch/class quantization
              plwQuantLevels[31]         31*2    vq tables for pitch/class quantization
              pplwQuantSections[4][3]    12*2    vq tables for pitch/class quantization
              pswRatioThld_1[4][6]       24      vq tables for pitch/class quantization
              piMultiLevelIndex[4]       4       vq tables for pitch/class quantization
              pswRatioThld_2[4][8]       32      vq tables for pitch/class quantization
              piMultiLevelIndex_2[4]     4       vq tables for pitch/class quantization
              swAlpha1                   1       pitch/class constants
              swAlpha2                   1       pitch/class constants
Table 6c: Fixed tables for Extension

File                 Table name                Length  Description
ExtNoiseSup_B.c      pswPePower                129     Coefficients to compute the pre-emphasis power spectrum
preProc_B.c          pswHpfCoef                15      High pass filter coefficients
preProc_B.c          pswLpfCoef                15      Low pass filter coefficients
preProc_B.c          pswLfeCoef                3       Low frequency emphasis filter coefficients
dsrAfeVad_B.c        piBurstConst              20      Burst length constants for different SNRs
dsrAfeVad_B.c        piHangConst               20      Hang length constants for different SNRs
dsrAfeVad_B.c        piVADThld                 20      VAD voice metric thresholds for different SNRs
dsrAfeVad_B.c        piVMTable                 90      Voice metric table as a function of SNR index
dsrAfeVad_B.c        piSigThld                 20      Signal threshold table as a function of SNR
dsrAfeVad_B.c        piUpdateThld              20      Update threshold table as a function of SNR
dsrAfeVad_B.c        pswShapeTable             23      Spectral shape correction table
fix_mathlib.c        coeff_sqrt5_58            5       Coefficients for computation of square root
fix_mathlib.c        coeff_sqrt5_78            5       Coefficients for computation of square root
rvc_pitch_init_B.h   ROM_astFrac               312     Fractions table
rvc_pitch_init_B.h   ROM_pstWindowshiftTable   514     Complex exponents table for time shifting in frequency domain
rvc_pitch_init_B.h   ROM_aswDirichletImag      8       Imaginary part of the Dirichlet kernel



4.5.3 Static variables used in the C-code 

This section presents three tables that specify the static variables for the AFE, VQ, and Extension, respectively. 
Table 7a: AFE static variables 

| Struct Name | Variable | Type[Length] | Description |
|---|---|---|---|
| QMF_FIR | | | |
| | lengthQMF | Word32 | QMF filter length |
| | *dp_l | Word16 | QMF filter low frequency coefficients |
| | *dp_h | Word16 | QMF filter high frequency coefficients |
| | *T | Word16 | Temporary QMF filter buffer |
| | T_dec | Word16 | Multiplier for T |
| DataFor16kProc_B | | | |
| | FrameLength | Word32 | Input frame length |
| | FrameShift | Word32 | Shift value for the frame |
| | numFramesInBuffer | Word32 | Number of frames in buffer |
| | SamplingFrequency | Word32 | Sampling frequency (8/16) |
| | Do16kHzProc | BOOLEAN | Flag to enable 16 kHz processing |
| | *hpBands_B | Word32 | Buffer for HP bands |
| | hpBandsSize | Word32 | hpBands_B buffer size |
| | CodeForBands16k_B | Word32[9] | HP coding buffer |
| | bufferCodeForBands16k_B | Word32[27] | Buffer used for HP coding |
| | codeWeights_B | Word16[3] | Code weights buffer |
| | bufferCodeWeights_B | Word16[9] | Buffer used for code weights |
| | *pQMF_Fir | QMF_FIR | Pointer to QMF FIR structure |
| | *bufferData16k_B | Word32 | Temporary buffer to carry QMF LP data |
| | bufData16kSize | Word32 | 16k data buffer size |
| | *FirstWindow16k | MelFB_Window | Pointer to MelFB Window structure |
| | noiseSE16k_B | Word32[3] | Noise spectral energy variable |
| | noise_dec | Word16 | Multiplier for noiseSE16k_B |
| | BandsForCoding16k_B | Word32[9] | Buffer for storing bands for coding |
| | vadCounter16k | Word32 | VAD flag counter |
| | vad16k | Word32 | VAD flag |
| | nbSpeechFrames16k | Word32 | Number of speech frames counter |
| | hangOver16k | Word32 | Hangover used for VAD |
| | meanEn16k | Word32 | Mean energy variable |
| | nb_frame_threshold_nse | Word32 | Threshold NSE for frame |
| | lambda_nse | Word16 | Lambda NSE variable |
| | *dataHP_B | Word32 | Buffer that stores the QMF HP value |
| | dec_16k | Word16[5] | Multiplier for dataHP_B buffer |
| | BFC_dec | Word16[1] | Multiplier for computing bands for coding |
| | fb16k_dec | Word16[3] | Buffer used to store the multiplier for the current and previous two frames |
| PostProcStructX | | | |
| | weightLMS | Word32[12] | Current LMS weight |
| CompCepsStructX | | | |
| | FFTLength | Word32 | FFT size |
| | Do16khzProc | Word16 | Flag to enable 16 kHz processing |
| | *pData16k | Word32 | Pointer to data for 16 kHz processing |
| WaveProcStructX | | | |
| | *TeagerFilter16 | Word32 | Pointer to Teager filter |
| | *TeagerWindow32 | Word32 | Pointer to Teager window |
| | TeagerOnset | Word32 | Unused |
| | FrameLength | Word32 | Input frame length |
| ns_var_F | | | |
| | SampFreq | Word16 | Sampling frequency (8/16) |
| | Do16khzProc | Word16 | Flag to enable 16 kHz processing |
| | buffers.nbFramesInFirstStage | Word32 | Number of frames in first stage |
| | buffers.nbFramesInSecondStage | Word32 | Number of frames in second stage |
| | buffers.nbFramesOutSecondStage | Word32 | Number of frames out of second stage |
| | buffers.FirstStageIn16Buffer | Word16[180] | First stage buffer |
| | buffers.SecondStageInBuffer32 | Word32[180] | Second stage buffer |
| | buffers.SecondDecalSig | Word16[4] | Shift factor for each sub-frame of second stage buffer |
| | prevSamples32.lastSampleIn32 | Word32 | Last input sample of DC offset compensation |
| | prevSamples32.lastDCOut32 | Word32 | Last output sample of DC offset compensation |
| | prevSamples32.oldShift | Word16 | Previous window shift factor of DC offset compensation |
| | spectrum.indexBuffer1 | Word16 | Where to enter new PSD for first stage, alternatively 0 and 1 |
| | spectrum.indexBuffer2 | Word16 | Where to enter new PSD for second stage, alternatively 0 and 1 |
| | spectrum.noiseSE1_32 | Word32[65] | Noise spectrum estimate for first stage |
| | spectrum.noiseSE1_dec | Word16[65] | Shift factor for noise spectrum estimate (first stage) |
| | spectrum.noiseSE2_32 | Word32[65] | Noise spectrum estimate for second stage |
| | spectrum.noiseSE2_dec | Word16[65] | Shift factor for noise spectrum estimate (second stage) |
| | spectrum.PSDMeanAntBuffer1 | Word32[65] | 1st stage PSD mean buffer for precedent frame |
| | spectrum.nSigSE1Ant_dec | Word16[65] | Shift factor for PSD mean buffer for precedent frame (1st stage) |
| | spectrum.PSDMeanAntBuffer2 | Word32[65] | 2nd stage PSD mean buffer for precedent frame |
| | spectrum.nSigSE2Ant_dec | Word16[65] | Shift factor for PSD mean buffer for precedent frame (2nd stage) |
| | spectrum.denSigSE1_32 | Word32[65] | 1st stage PSD mean buffer |
| | spectrum.nSigSE1Cur_dec | Word16[65] | Shift factor for PSD mean buffer (1st stage) |
| | spectrum.denSigSE2_32 | Word32[65] | 2nd stage PSD mean buffer |
| | spectrum.nSigSE2Cur_dec | Word16[65] | Shift factor for PSD mean buffer (2nd stage) |
| | vad_data_ns_F.nbFrame | Word16[2] | Number of frames (for the 2 stages) |
| | vad_data_ns_F.flagVAD | Word16 | VAD flag (1 = SPEECH, 0 = NON SPEECH) |
| | vad_data_ns_F.hangOver | Word16 | Hangover |
| | vad_data_ns_F.nbSpeechFrames | Word16 | Number of speech frames (used to set hangover) |
| | vad_data_ns_F.meanEn32 | Word32 | Mean energy for VAD |
| | vad_data_ca.flagVAD | Word16 | VAD flag (1 = SPEECH, 0 = NON SPEECH) |
| | vad_data_ca.hangOver | Word16 | Hangover |
| | vad_data_ca.nbSpeechFrames | Word16 | Number of speech frames (used to set hangover) |
| | vad_data_ca.meanEn32 | Word32 | Mean energy for VAD |
| | vad_data_fd.MelMean | Word16 | SpeechQMel (for frame dropping) |
| | vad_data_fd.VarMean | Word32 | SpeechQVar (for frame dropping) |
| | vad_data_fd.AccTest | Word32 | SpeechQSpec (for frame dropping) |
| | vad_data_fd.AccTest2 | Word32 | |
| | vad_data_fd.SpecMean | Word32 | SpecMean (for frame dropping) |
| | vad_data_fd.MelValues | Word16[2] | SpeechQMel (for frame dropping) |
| | vad_data_fd.SpecValues | Word32 | SpeechQSpec (for frame dropping) |
| | vad_data_fd.SpeechInVADQ | Word16 | Flag (for frame dropping) |
| | vad_data_fd.SpeechInVADQ2 | Word16 | Flag (for frame dropping) |
| | gainFact.logDenEn1_32 | Word32[3] | Denoised frame energy for gain factorization |
| | gainFact.lowSNRtrack32 | Word32 | Low SNR level for gain factorization |
| | gainFact.alfaGF16 | Word16 | Wiener filter gain factorization coefficient |
| VADStructX_F | | | |
| | Focus | Word16 | Position in circular buffer |
| | HangOver | Word16 | Hangover length |
| | FlushFocus | Word16 | Position in circular buffer when emptying at end |
| | H_CountDown | Word16 | Main hangover countdown |
| | V_CountDown | Word16 | Short hangover countdown |
| | **OutBuffer | Word32 | outBuffer pointer pointer |
| | *OutBuffer | Word32[7] | outBuffer pointer |
| | OutBuffer | Word16[7x15] | outBuffer |
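
As an illustration of how rows of Table 7a map onto C declarations, the QMF FIR entries could be declared roughly as follows. This is a sketch only: the typedefs for Word16 and Word32, the field order, the spellings dp_l, dp_h and T_dec (as read from the table), and the helper qmf_fir_init are assumptions and are not taken from the reference C-code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int16_t Word16; /* assumption: 16-bit signed basic type */
typedef int32_t Word32; /* assumption: 32-bit signed basic type */

/* Fields follow the QMF FIR rows of Table 7a; order and ownership are assumed. */
typedef struct {
    Word32  lengthQMF; /* QMF filter length */
    Word16 *dp_l;      /* QMF filter low frequency coefficients */
    Word16 *dp_h;      /* QMF filter high frequency coefficients */
    Word16 *T;         /* temporary QMF filter buffer */
    Word16  T_dec;     /* multiplier for T */
} QMF_FIR;

/* Hypothetical initializer: attaches caller-owned buffers of length len
   and clears the temporary buffer. Setting T_dec to 0 is an assumption. */
void qmf_fir_init(QMF_FIR *f, Word32 len, Word16 *lo, Word16 *hi, Word16 *tmp)
{
    Word32 i;

    assert(f != NULL && lo != NULL && hi != NULL && tmp != NULL && len > 0);
    f->lengthQMF = len;
    f->dp_l = lo;
    f->dp_h = hi;
    f->T = tmp;
    f->T_dec = 0;
    for (i = 0; i < len; i++) {
        tmp[i] = 0; /* start from an empty filter state */
    }
}
```

Keeping the buffers caller-owned mirrors the pointer entries in the table, which list no array length for *dp_l, *dp_h and *T.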



Table 7b: VQ static variables 

| Struct Name | Variable | Type[Length] | Description |
|---|---|---|---|
| coder_VAD.c | four_frames[27] | Word16[27] | Previous frames used to build multiframe |
| | plwQPHistory[3] | Word32[3] | History of pitch |
| | iReliableFlag | Word16 | Pitch reliability flag |
Table 7c: Extension static variables 

| Struct Name | Variable | Type[Length] | Description |
|---|---|---|---|
| | iFirstFrameFlag | Word16 | First frame flag |
| | pswUBSpeech | Word16[200] | Upper band speech |
| | pswDownSampledProcSpeech | Word16[75] | Down-sampled processed speech |
| | lwCritMax | Word32 | Maximum power ratio |
| | iOldPitchPeriod | Word16 | Old pitch period value |
| | iOldFrameNo | Word16 | Old frame number |
| PCORR_STATE_be | s_be | | |
| | lwX1_X1 | Word32 | X1*X1 |
| | lwZ1_Z1 | Word32 | Z1*Z1 |
| | lwZ2_Z2 | Word32 | Z2*Z2 |
| | lwX1_Z1 | Word32 | X1*Z1 |
| | lwX1_Z2 | Word32 | X1*Z2 |
| | lwZ1_Z2 | Word32 | Z1*Z2 |
| | swX1_Sum | Word16 | Sum of X1 |
| | swZ1_Sum | Word16 | Sum of Z1 |
| | swZ2_Sum | Word16 | Sum of Z2 |
| | iBurstConst | Word16 | Burst constant |
| | iBurstCount | Word16 | Burst count |
| | iHangConst | Word16 | Hang constant |
| | iHangCount | Word16 | Hang count |
| | iVADThld | Word16 | VAD threshold |
| | iFrameCount | Word16 | Frame count |
| | iFUpdateFlag | Word16 | Forced update flag |
| | iHysterCount | Word16 | Hysteresis count |
| | iLastUpdateCount | Word16 | Last update count |
| | iSigThld | Word16 | Signal threshold |
| | iUpdateCount | Word16 | Update count |
| | iChanEnrgShift | Word16 | Channel energy shift |
| | iChanNoiseEnrgShift | Word16 | Channel noise energy shift |
| | pswChanEnrg | Word16[23] | Channel energy |
| | pswChanNoiseEnrg | Word16[23] | Channel noise energy |
| | swBeta | Word16 | Beta value |
| | swSnr | Word16 | SNR value |
| NormSw | pnsLogSpecEnrgLong | | |
| | swMantissa | Word16[23] | Mantissa |
| | iShift | Word16[23] | Shift |
| | swC0 | Word16 | C0 value |
| | swC1 | Word16 | C1 value |
| | swC2 | Word16 | C2 value |
| | pswHpfXState | Word16[6] | High pass filter input state |
| | pswHpfYState | Word16[12] | High pass filter output state |
| | pswLpfXState | Word16[6] | Low pass filter input state |
| | pswLpfYState | Word16[12] | Low pass filter output state |
| | pswLfeXState | Word16 | Low frequency emphasis filter input state |
| | pswLfeYState | Word16[2] | Low frequency emphasis filter output state |
5 File formats 

This section describes the file formats used by the AFE, VQ and Extension programs. 



5.1 Speech file 



Speech files read by the X-AFE and written by the Extension consist of 16-bit words. The byte order depends on the 
host architecture (e.g. MSByte first on SUN workstations, LSByte first on PCs). 
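
Because the byte order follows the host architecture, a program that exchanges speech files between platforms has to know the order in which a file was written. A minimal sketch of byte-order-explicit reading is given below. The function names decode_sample and read_speech and the msb_first parameter are hypothetical and do not appear in the reference C-code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: build one signed 16-bit sample from two raw bytes.
   msb_first != 0: file written MSByte first (e.g. SUN workstation).
   msb_first == 0: file written LSByte first (e.g. PC). */
int16_t decode_sample(const unsigned char b[2], int msb_first)
{
    uint16_t u = msb_first ? (uint16_t)((b[0] << 8) | b[1])
                           : (uint16_t)((b[1] << 8) | b[0]);

    /* Map the unsigned pattern to a signed value without relying on
       implementation-defined narrowing conversions. */
    return (u & 0x8000u) ? (int16_t)((int32_t)u - 65536) : (int16_t)u;
}

/* Hypothetical helper: read up to max_samples samples from fp into out.
   Returns the number of complete samples read; a trailing odd byte is ignored. */
size_t read_speech(FILE *fp, int16_t *out, size_t max_samples, int msb_first)
{
    unsigned char b[2];
    size_t n = 0;

    assert(fp != NULL && out != NULL);
    while (n < max_samples && fread(b, 1, 2, fp) == 2) {
        out[n++] = decode_sample(b, msb_first);
    }
    return n;
}
```

Reading byte by byte makes the result independent of the byte order of the machine running the code, so the same routine can be used on either kind of host once the order of the file is known.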
Annex A (informative): 
Change history 



Change history 

| Date | TSG# | TSG Doc. | CR | Rev | Subject/Comment | Old | New |
|---|---|---|---|---|---|---|---|
| 2004-06 | 24 | SP-040343 | | | Version 6.0.0 approved at 3GPP TSG SA#24 | 2.0.0 | 6.0.0 |
| 2004-12 | 26 | SP-040837 | 001 | 1 | Software bug correction: Removal of Basicops simulation of "C" shift operator | 6.0.0 | 6.1.0 |
| 2004-12 | 26 | SP-040837 | 002 | 1 | Software bug correction: Initialization of the variables Iwc and i2aScale | 6.0.0 | 6.1.0 |
| 2004-12 | 26 | SP-040837 | 003 | 1 | Software bug correction: Wrong assignment of the variables *piReliableFlag and *pcQPIndex | 6.0.0 | 6.1.0 |
| 2004-12 | 26 | SP-040837 | 004 | 2 | Software bug correction: Use of incorrect variable fRefPeriod instead of iRefPeriod | 6.0.0 | 6.1.0 |
| 2004-12 | 26 | SP-040837 | 005 | | Add reference to test sequences document | 6.0.0 | 6.1.0 |